The Polygraph Place



  Polygraph Place Bulletin Board
  Professional Issues - Private Forum for Examiners ONLY
  Practice exams for federal agent, state/local LE, or contractor positions (Page 3)

Author Topic:   Practice exams for federal agent, state/local LE, or contractor positions
dkrapohl
Member
posted 01-25-2012 06:44 AM
Gordon:
The Center is a step or two removed from the DQ/SR stats of the 26 customer agencies. The closest documented numbers we have go back to the days almost a decade ago when the DoD produced its annual report to Congress. Perhaps you'll remember better than I, but it seems that the percentage of unresolved SR cases was in the low single digits. It's pretty hard to make a blanket statement about the DQ/SR rate across government. Let me throw in another data point. Many agencies do periodic re-screening of employees. If the SR rate were excessive, those agencies would be drowning in unresolved polygraph cases, since they conduct thousands of these periodic tests every year. That problem has not been reported. For this reason alone I would offer that the false positive argument is misrepresented by many.

On an unrelated issue, it was good to hear you were coming to the memorial service for Paul Menges next month. Paul was a good man, did a lot of good for his country and fellow man. We really miss him. I look forward to seeing you there.

Don


rnelson
Member
posted 01-25-2012 11:11 AM
Dan,

All kidding aside, we both know nothing is solid gold (heck even solid gold is often not).

I'm still thinking this is partially a stick-in-the-eye exercise. Otherwise I cannot imagine what you gain by blowing yourself up like this.

So thanks for bringing some animation to this comatose forum. BTW, it was good to finally meet you face to face. I enjoyed dinner. Well, Chili's is the same everywhere, but it was great to talk. I hope you can meet my lovely bride Nayeli when we are next in New Hampshire. She has education and training like you and me – in psychology and polygraph.

As for the statistical analysis (alchemy), I'm just showing that what we presently know about polygraph accuracy, viewed through an analysis of the mathematical potential of a multi-issue test, indicates that Don is correct that the FP arguments seem to have been misunderstood or misrepresented. It is understandable, and predictable, that some portion of people will not pass their polygraphs. This is not a bad thing. It is probably a good thing.
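
A quick back-of-the-envelope sketch of that multi-issue arithmetic (the numbers are invented, and it assumes - unrealistically - that the relevant questions behave independently):

# Illustrative sketch only: how a multi-issue screening exam can produce a
# predictable failure rate even when per-question specificity is high.
# Assumes (unrealistically) independent relevant questions; the numbers
# below are made up for illustration, not taken from any study.

def prob_truthful_examinee_passes(per_question_specificity, n_questions):
    """Probability a truthful person clears every relevant question."""
    return per_question_specificity ** n_questions

for spec in (0.90, 0.95):
    for k in (2, 3, 4):
        p_fail = 1 - prob_truthful_examinee_passes(spec, k)
        print(f"specificity={spec:.2f}, questions={k}: "
              f"{p_fail:.1%} of truthful examinees fail or need follow-up")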

As with everything, statistical alchemy is not infallible. Different ways of analyzing things can lead to slightly different conclusions. This is because nothing is perfect. There are in fact many strategic compromises that get made in research and statistical analysis, and in test development/validation. Some procedures have more power than others. Some procedures may seem more directly applicable and generalizable.

Someone once said there are three kinds of lies: 1) lies, 2) damn lies, and 3) statistics. I've heard this was Mark Twain, but it is also attributed to Benjamin Disraeli.

This is exactly why scientific professions and editors of scientific journals require access to the data – we like the ability to recalculate and verify the results, and evaluate the data for ourselves in alternative ways. It relieves our sense of anxiety that something funny and self-serving could be happening.

One thing Matte did well was to publish both means and standard deviations in his doctoral dissertation. He also published all the scores, and it is clear that the scores calculate to the mean and variance as reported. Unfortunately he gave only one set of scores, so we could not recalculate the near-perfect .99 Pearson correlation coefficient. When we have within-study correlations of inter-scorer numerical scores that are this extraordinarily high, it is quite unexpected to find significant between-study differences in the mean numerical scores. There are always several possible causes for these significant interaction differences, including the case selection method and the use of exams from a single highly experienced expert. It also remains possible there were subtle differences in the ways the testing procedures or scoring procedures were applied. We cannot know with certainty without access to more of the data. Regardless of all this, Matte did much more than some others. We had to obtain data and calculate many standard deviation statistics for the first time (this having been neglected by some researchers).

The recent meta-analysis includes, in the appendices, enough information that someone could recalculate any of the two-way unbalanced ANOVA contrasts and verify the statistical analysis. Someone could recalculate all of the weighted meta-analytic results, along with the distributions and confidence intervals for the study results. If someone took the time to re-obtain all the data - which we know is available - the mean, variance, and sampling Ns can also be verified for most samples. As was the case with the NRC report, we came away from the meta-analysis with one of our largest concerns being the sampling methodology.

What concerns me most, in a very practical sense, is that some polygraph examiners, though they believe in themselves, still have trouble believing in the polygraph test. In fact, they have trouble recognizing or admitting that the polygraph test is a scientific test. I am beginning to suspect that this is driven by fear or insecurity – that the polygraph maybe isn't scientific, or, a more personal, and unnecessary, concern that they might not be able to understand the science behind the polygraph. A darker concern involves the self-centered fear that examiner expertise or some individual prowess (shamanism anyone?) becomes devalued as we demonstrate that polygraph accuracy can be reproduced by any reasonably smart person who can learn to understand and apply the principles of scientific testing.

Scientific polygraph is not just a matter of courts or court admissibility. It is a social and ethical concern – as you have illustrated. Polygraph tests are used as a decision support tool in matters that affect both individuals and the community (and nation). I will continue to argue that our (law enforcement's and government's) primary ethical obligation is to the local community and nation, and secondarily to the individual. Of course, clinicians – in medicine and psychology – have a slightly different ethical mandate that tells them the primary obligation is to the individual (unless the individual is planning to harm another individual).

The point is this: if we use the polygraph as an investigative and decision support tool, then we are eventually going to be asked to account for ourselves. Failure to account for ourselves in terms of science (and that means measurement and statistical alchemy) will result in us losing credibility with the people whom we want to implement policies (and funding) to make use of the incremental validity that polygraph can add to risk assessment and risk management decisions.

If we continue to develop the admittedly imperfect polygraph test according to the scientific evidence, then it will slowly improve over time. Improvement of imperfect methods is an indicator of a legitimate field of science. All fields of science are imperfect and need improvement. Which is why we are concerned about the reporting of perfect accuracy. Sure, maybe one highly experienced expert can achieve perfect accuracy when selecting cases that are confirmed via confession. But is it reproducible by the average smart person (oxymoron, I know)? Probably not. Most people will have imperfect results. So what does our profession achieve by the publication of perfect results? Not much. The gain is for the proprietary seller of a method. Probably the gain will be temporary – in that reasonable people will not truly believe in perfect accuracy for long. But the world is still big enough that it could make for effective marketing if one could convince some of the more naïve folks that perfect or near-perfect results are possible. In a scientific sense, samples of perfect accuracy are of little use. Perfect results do not help to make realistic calculations of the probability of error. Perfect results claim to be “certainty” and not “probability.” Perfect results claim to have found some psychophysiological mechanism that is uniquely and perfectly correlated with lying and nothing else. This is absurd, of course, because all psychological and physiological functions are correlated with multiple activities and causes.

Nevertheless, polygraph researchers have, in the past, reported perfect, or near perfect results. Realistic assessment suggests those studies are flawed and confounded in identifiable ways. More realistic studies have consistently reproduced imperfect results that are significantly greater than chance. But the reporting of perfect results is confusing. Some people prefer to pretend, and simply believe in perfect accuracy. Others, scientific critics, scoff at us for being so silly to try to claim perfection. In a larger sense, the reporting of perfect, unrealistic, and unreproducible results interferes with our ability to identify and chart improvements in the scientific accuracy of our test. If you look at the trend of studies since about Abrams (1971), the polygraph would seem to have been nearly perfect about 40 some years ago, and has become less accurate over time. Is that because we have goofed up the accuracy of the polygraph by attempting too much science??? Probably not. More likely our estimates of polygraph accuracy have improved. There are indications that the polygraph has improved over time, but they are difficult to tease out because of all the confounded results and information that is difficult to reconcile.

Other fields of science have struggled with this also. As a result, we see increased emphasis on chasing down and investigating the reasons for spurious and irreconcilable results. Sometimes they find a contaminated lab environment, and sometimes they find some ethical misconduct in the form of proprietary or professional interests prompting someone to cook their data. Sometimes they find that someone simply overlooked something that confounded the results. The Internet, as a research tool, has changed things a lot, and a growing number of troublesome scientific studies are now retracted from publication each month as scientific professions seek to police themselves more carefully. Bad science, like an STD, can be a gift that keeps on giving.

Something that occasionally interferes with progress in the science of lie detection is the unproductive repetition of vacuous and thoughtless hyperbole re the inability to study the polygraph scientifically. If we keep saying it can't be done, then we will not try. Guess what? It can be done. It is being done. We can describe the scientific mechanisms that make the test work, and we can actually start to study and describe the potential accuracy of polygraph tests conducted on multiple independent targets. It's not simple, and it requires that we continue to tolerate uncertainty. But uncertainty in scientific fields is measurable – because it is just another form of probability (statistical alchemy). We will, over time, continue to improve our estimates of accuracy for polygraph screening tests. However, if we don't start somewhere there is nothing to improve from. We can wait for perfection and certainty forever... or we can start now with the imperfect data and imperfect research methods that we have available for use today.

If you don't want to learn anything about the science of polygraph then you don't have to. You are not obligated. Each of us field examiners is obligated only to do the best job we know how to do. That will mean using the best training and methods we have access to. And it will mean adhering to standards that are intended to protect the community, the profession, and the professional.

But I will ask you this: please don't prevent or discourage others from advancing the profession.

Not improving, neglecting to learn or implement any new knowledge or improved procedures, and anchoring all of our wisdom and answers to one individual will be interpreted by outsiders as an indicator that the polygraph profession is a cult-like pseudoscience.

If we do it right, then we should be able to achieve professional standards and social policies re polygraph testing that will lead to greater customer and community satisfaction, and increased professional and job security for those of us who strive to increase individual justice, community safety and national security (and pay our bills and feed our families) by working in the field of scientific credibility assessment.

If we don't do it right, if we sell “i-believe-in-my-special-abilities-to-detect-deception-but-not-the-replicable-science-of-polygraph,” then all of our efforts to promote the profession will ultimately degrade to professional sniping. Judgments about whose results are correct and whose are incorrect will be made not by discussing polygraph methods and data, and decisions that affect individual justice, community safety and national security will not be made according to the merits and deficiencies of the examination data. Instead, judgments will result from beauty-pageant decisions about who has the prettier resume, who worked for the sexier agency, who was anointed by the larger guru, or who has the more forceful or charming personality – or who has the most political and economic influence.

I would prefer to think that our job is to help reduce corruption and help people to make better decisions.

Gordon Barland's question is a good one: does failure/DQ equate to the FN rate? I don't think Don addressed this. If the failure/DQ rate is a function of test sensitivity, and if test sensitivity is significantly greater than chance, then I suspect the FN rate does vary as a function of the DQ/failure rate. This would be illustrated by the Gaussian-Gaussian signal detection model (described by Barland, 1985). However, there is reason to also suspect that the relationship will be non-linear. In that case there would be some point of optimization that should be studied – but that will require more statistical alchemy...
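
To make the shape of that relationship concrete, here is a toy sketch of the Gaussian-Gaussian model - the separation, base rate, and cutoffs below are invented purely for illustration, not estimates of anything:

# Toy Gaussian-Gaussian signal-detection model (illustrative numbers only).
# Deceptive and truthful examinees are assumed to produce normally
# distributed scores with different means; moving the cutoff trades the
# overall SR/DQ rate against the false-negative rate, and not linearly.
from scipy.stats import norm

d_prime = 1.5      # assumed separation between truthful and deceptive score means
base_rate = 0.10   # assumed proportion of deceptive examinees in the applicant pool

for cutoff in (-1.0, -0.5, 0.0, 0.5, 1.0):
    p_sr_truthful = norm.sf(cutoff, loc=0.0)        # false-positive rate at this cutoff
    p_sr_deceptive = norm.sf(cutoff, loc=d_prime)   # sensitivity at this cutoff
    dq_rate = base_rate * p_sr_deceptive + (1 - base_rate) * p_sr_truthful
    fn_rate = base_rate * (1 - p_sr_deceptive)
    print(f"cutoff={cutoff:+.1f}  SR/DQ rate={dq_rate:.1%}  FN rate={fn_rate:.1%}")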

Anyway, Dan I hope we can have dinner again.

Peace,

r

[This message has been edited by rnelson (edited 01-25-2012).]


Bob
Member
posted 01-25-2012 12:07 PM
Ray- I do enjoy reading your posts, and very well said.

Now if I only understood statistics, I'd have it made,
bob


Gordon H. Barland
Member
posted 01-25-2012 06:06 PM
Ray, I mis-stated my concern. What I intended to say is that I think many of the people critical of the polygraph tend to equate the high failure/DQ rate of some agencies with the false positive rate. But for those examinees who are SR and make disqualifying admissions in the post test, the SR outcomes are effectively true positives - though perhaps not in the eyes of some of the examinees. Moreover, many disqualifying admissions occur in the pretest. Regardless of whether charts are run or not, and even if the charts are NSR indicating that the admissions were correct, these pretest disqualifications are lumped together with the "50% SR/DQ rate." It would seem to me that one of the key statistics that needs to be reported in order to estimate the FP rate is the percentage of "SR chart outcomes with no disqualifying information," as that would likely give a reasonable estimate of the ceiling for FP errors. Of course, since the Internet advice to dummy up has become ubiquitous, the true FP rate could still be significantly less than the ceiling.

Don correctly interpreted what I had intended to say, though I'm not familiar with any specific figures regarding the number of SR/DQ exams, whether confirmed by admissions or not.

Peace.

Gordon

------------------

[This message has been edited by Gordon H. Barland (edited 01-26-2012).]


Dan Mangan
Member
posted 02-01-2012 04:14 PM
Ray,

It’s like I said when we spoke on the phone a month or so ago…you practically have to firebomb this board to get any real action.

I clam up for a week and the cobwebs come back.

But your last post, as passionate and persuasive as it may be to some, ignores some glaring soft spots. Many of the criticisms (of polygraph in general) contained in the 2003 NAS report still apply. I don't have the text handy, but a few spring to mind: the weak argument for the CQT; the lack of an identifiable lie response; how fear of failure can mimic the so-called lie response; the myriad individual variables; countermeasures, etc.

Speaking of the 2003 NAS report, they (NAS) made clear a caveat that an absolute percentage reflecting specific-incident polygraph test accuracy should not be extrapolated, but the APA has continually glommed on to the 89% figure.

That’s just about the same number the research committee came up with this time around, is it not? With that kind of accuracy you should be jumping all over Maschke’s countermeasure challenge. It would make for a great PR stunt, taking GM to the woodshed. But it will never happen, because…

Back to the recent meta-analysis: You make it sound like polygraph has been pronounced “scientific” by MIT, Underwriter’s Laboratories or even Consumers Union. But it’s an inside job from top to bottom.

Let’s take a look at the source data used in your much-ballyhooed meta-analysis..

What we have here is a small group of polygraph activists who submitted their own surveys at the behest of a self-serving trade group in a quest to scientifically legitimize polygraph.

* There was no independent scientific oversight or disinterested controlling authority.
* The inclusion of laboratory studies likely downplays the frequency of FP and FN results that occur in real-world tests.
* There was no vigorous or even measurable testing to gauge how polygraph fares against countermeasures.

I doubt that kind of approach will win over many skeptics.

Going in a different direction:
Somewhere in this thread, Don talked about running in different circles. Indeed those circles exist. If you want to see how polygraph “works” in the real world, try this:

1. Randomly choose 300 examiners from the APA rolls, including a proportionate number from the international side.
2. Solicit those examiners to perform a “test” like CBS’ 60 Minutes did with their 1980s sting.
3. Winnow the field to 100 examiners, based on fee (lowest fee gets the gig).

Logistics aside, wouldn’t that be a more accurate reflection, in a practical sense, of how polygraph really works?

I remember an exercise Don led in 2004 at a Maine Polygraph Association event… Don passed out sets of charts from confirmed cases and had attendees score them. Initial blind scoring accuracy was about 50/50. Don then lectured for a good chunk of the day on scoring, emphasizing the defensible dozen. Near the end of the session, he passed out more confirmed cases for scoring. How did the enlightened souls fare? Not so good: Accuracy remained at about 50/50. Not very reassuring, and perhaps a strong argument for reliance on automated scoring. But then there’s that pesky artifact problem…

Pre-test bias is yet another factor, and it deserves its own thread.

Polygraph is a great investigative and interrogation tool, no doubt about it. But, in my view, polygraph should not be used in court and/or where an individual’s liberty is at stake, or as a deciding factor when assessing a troubled marriage.

Even if there is the most slender of bona fide scientific threads that keep it from all unraveling, the sheer number of variables in a polygraph test negates the scientific half of the tortured equation.

Ray, I look forward to having dinner with you again. (Some background: Ray treated me to dinner at Uno's -- not Chili's -- not too long ago when he was in NH. Ray generously offered to foot the bill for the restaurant's new Webb-Todd Meal Deal: All you can eat and unlimited alcohol.) Yes, we will do it again.

Meanwhile, we will have to agree to disagree.

You remind me, in a good and admirable way, of some of the doggedly determined engineers I worked with when I was in industry. To them, nothing was out of reach, solution wise.

But some problems are unsolvable and some things in life defy definition.

Sometimes alchemy is nothing more than...alchemy. But if you want to believe, have at it.

Polygraph is far closer to shamanism than it is to science. Always has been, always will be.

Best,
Dan


Barry C
Member
posted 02-01-2012 04:52 PM
The NAS panel certainly weren't insiders, and they "pronounced" polygraph as a science. (If they hadn't, they would be saying it couldn't be studied.) Whether they did so or not is irrelevant. It's science because it's science - not because somebody declares it to be so. (If you're wondering what they said, they referred to polygraph as a subfield of forensic science.)

quote:
Speaking of the 2003 NAS report, they (NAS) made clear a caveat that an absolute percentage reflecting specific-incident polygraph test accuracy should not be extrapolated, but the APA has continually glommed on to the 89% figure.

The reason you can't call any figure the "absolute percentage" is because we are dealing with the issue from a scientific perspective. That is, we are estimating a population parameter (or parameters) by looking at samples of the population. Thus, our estimate(s) are going - unless we're really lucky - to be off. That's why you see confidence intervals around any estimate (and even those could be off). Of course, that's true of all estimates.
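
As a rough sketch of what that means in practice (the hit count and sample size here are hypothetical, not figures from any study):

# Sketch: a reported "accuracy percentage" is an estimate with a confidence
# interval around it. The hit count and sample size below are hypothetical.
import math

def wilson_interval(hits, n, z=1.96):
    """95% Wilson score interval for a proportion."""
    p = hits / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

hits, n = 178, 200   # e.g., 89% observed decision accuracy in a sample of 200
lo, hi = wilson_interval(hits, n)
print(f"observed {hits/n:.1%}, 95% CI roughly {lo:.1%} to {hi:.1%}")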

The fact that "insiders" and outsiders came up with accuracy profiles that are consistent is a plus - not a negative. It's possible to compare the results of "outsider" studies that didn't meet the standards for inclusion and see how they compare. That's a fair criticism, and there is a method of testing it.

How does one win with GM? As Gordon Barland pointed out somewhere along the way, I believe, it's a no-win scenario for polygraph, and GM knows it.

quote:
If you want to see how polygraph “works” in the real world, try this:

1. Randomly choose 300 examiners from the APA rolls, including a proportionate number from the international side.
2. Solicit those examiners to perform a “test” like CBS’ 60 Minutes did with their 1980s sting.
3. Winnow the field to 100 examiners, based on fee (lowest fee gets the gig).

Logistics aside, wouldn’t that be a more accurate reflection, in a practical sense, of how polygraph really works?


I doubt it. You're taking a non-random sample from a random sample, so whom would it represent? It might represent the worst or the best. I'm not sure, but nobody could be, and that's a problem. It's also a problem when "experts" are the only people conducting tests in a study. What do those numbers represent? They represent the average accuracy of experts....


dkrapohl
Member
posted 02-01-2012 08:19 PM
Dan:
I have done the blind scoring exercise in perhaps a couple dozen settings or more in the US and elsewhere. In none of them did we have chance-level accuracy the first time around, and group accuracy has always improved when we discussed how scoring should be done. I usually keep my spreadsheets of examiner decisions, and if my Maine data is still around and shows you to be right about dismal accuracy or lack of improvement, dinner is on me plus a public acknowledgment. What I have found is that accuracy does vary as a function of where I do the exercise (Northwest Association is the best so far), but the Mainiacs did well as I recall.

A more reasonable estimate of accuracy than that 6-case demo might come from the 30 or so blind scorings I've collected on 100 confirmed field cases. Collective accuracy homes in on the same range as the NAS estimates from their ROC analysis: upper half of the 80s. Admittedly, the scorers are all self-selected, but the charts were a fairly large random collection, the good, the bad, and the very ugly according to feedback I get from the scorers. No cherry-picking of pretty charts. It is my belief that when lots of independent accuracy estimates come to the same conclusion (like those for CVSA showing chance-level accuracy) it's more likely to be true. It certainly does not make the case that polygraphy is all chicanery.

Let me return to an unanswered question from several posts ago. I am still confused about the mismatch between your published research findings of 100% and your opinion about polygraph accuracy. Perhaps it's just your grenade-throwing approach to invigorating the discussion, but if you really want to get the lurkers involved, explain how your research got the results it did. It remains one of the great mysteries of recent times.

Hope to see you at APA. Ray will buy the drinks (throwing you under the bus, Ray).

Don


clambrecht
Member
posted 02-02-2012 02:46 AM
The last few comments brought this lurker out.
Quite simply, I do not understand the significance of the above stats. I am not doubting or criticizing, because I am fully aware of the hard research that has been done, so please educate me if I am wrong:

A high percentage of examiners correctly categorizing confirmed charts is informative for internal, scoring consistency purposes. Yet it obviously says nothing of how often examinees produce "false" charts. Is this the same measurement promoted on the APA website in the meta analysis of 2011? (The "FAQ" link on the APA site for that analysis is broken).

When people ask about accuracy, they really are asking: "I trust you can score and classify charts; ESS has made it so easy a caveman can do it... I want to know, out of 100 tests, how often will false charts be produced? How many times will you interrogate an innocent man and pat the guilty on the back?" I am probably missing something and will humbly accept correction, but don't these decision accuracy studies miss this point?

If 100 doctors are shown 1000 MRIs of confirmed brain tumors and 80 doctors correctly see the tumors, this doesn't mean MRI exams work 80% of the time...or detect tumors. (I am sure this analogy falls short somewhere... but it is past my bedtime! fire away if it needs help)

Science/Art debate: The polygraph is more akin to an experiment/test of other sciences, not a science in and of itself. We are all testing the hypothesis that liars produce distinctive charts from the truthful. Critics do not believe this. We have debates often on techniques and formats because we are all "running experiments" independently - copying others who have done research months or years ago.

Finally, the stakeholder - researcher comments above are important. As a fairly new examiner, I was initially leery of some researchers because of obvious appearances of conflicts of interest. Yet all quality industries conduct and hire "in-house" researchers. The researchers in polygraph we have on this board and others make it transparent that they follow established guidelines, utilize peer review, and seek critique.
Yet, it would be nice if there were more independent research to appease critics. Do any of the polygraph associations or instrument makers offer grants to universities to fund independent research? Offer funding, equipment and examiners, and let students/faculty design tests and publish results in their own discipline's journals.

[This message has been edited by clambrecht (edited 02-02-2012).]

[This message has been edited by clambrecht (edited 02-17-2012).]


dkrapohl
Member
posted 02-02-2012 06:37 AM
Clambrecht:
Welcome to the discussion. You have made some excellent points, and I hope to see you participate more often.

Let me address at least some of your questions as best I can. First, false charts. How often do examinees produce physiological patterns that are contrary to their veracity? In the current state of knowledge, it seems that this certainly happens, and at percentages that both polygraph critics and advocates seem to hate. Looking first at the carefully conducted research at the University of Utah, in single-event testing it's about 5%-10%, plus or minus. For field research we have the work of Avital Ginton, who looked at paired testing where one of the examinees had to be lying. He found false charts approaching 20%, though the testing methodologies were not described. A best guess is that the real answer is somewhere in between. If so, this places the estimate somewhat near what the NAS and others have found. The converging lines of evidence are another reason to support the validity of polygraphy.

As to your question about grants to researchers, the NCCA has had a grant and dissertation program for at least 20 years. Not many takers on researching polygraph decision accuracy in the past decade or so because there seems to be an implicit consensus that there isn't a lot more to discover. Lots of other stuff to see on the NCCA site, though.

Gotta go jump into traffic. Hope there is an answer to your questions here somewhere.

Don


cpolys
Member
posted 02-02-2012 09:56 AM
Clambrecht,

The links on the APA website for the FAQ and Executive Summary have been corrected. Thanks for making note of the problem.

Marty


rnelson
Member
posted 02-02-2012 10:39 AM
clambrecht:

I like the caveman thing. It's fun.

I also like to say that the polygraph is half science and half art. It's also half common sense, half voodoo and half magic fairy dust. That's a lot of halves.

What is most important is to think beyond the 3 second sound-bite. This kind of explanation makes us feel OK, but does not satisfy the questions of outsiders and critics. They want us to account for the science part of polygraph. If we don't do that, then the polygraph would be all art.

That is what we are attempting to study.

quote:
A high percentage of examiners correctly categorizing confirmed charts is informative for internal, scoring consistency purposes. Yet it obviously says nothing of how often examinees produce "false" charts. Is this the same measurement promoted on the APA website in the meta analysis of 2011? (The "FAQ" link on the APA site for that analysis is broken).

When people ask about accuracy, they really are asking : "I trust you can score and classify charts, ESS has made it so easy a caveman can do it...I want to know out of 100 tests, how often will false charts be produced?...how many times will you interrogate an innocent man and pat the guilty on the back?" I am probably missing something and will humbly accept correction, but don't these decision accuracy studies miss this point?


Great question. A subtle one, but very important.

Do blind scoring studies tell us about the proportion of examiners who score correctly when the data itself contains useful proportions of diagnostic and error variance (signal and noise)?

vs. this:

Does the data contain useful proportions of diagnostic and error variance? What proportion of cases contains problematic proportions of diagnostic and error variance?

This is exactly the reason that scientific thinkers are of the mindset that studies reporting perfect accuracy are of little actual value. That is, until we find Pinocchio's Nose (some physiological function that is uniquely correlated with lying and nothing else) - which the physiologists tell us just ain't gonna happen. All human physiological responses serve multiple functions (or the body would be a lot less efficient).

Samples with no errors are probably the result of some sampling confound (a procedural issue that introduces the possibility of an alternate explanation for the study results). Samples without errors are devoid of error variance.

All representative samples contain both diagnostic and error variance. If we are scoring valid physiological features then our data will have a lot of diagnostic variance (signal) and only a little bit of error variance (noise). But error variance and noise will be there because there is no such thing as a perfect test that works perfectly with everyone every time.

Another version of your question: Do real-life field polygraph examinees ever produce polygraph data that will give you the incorrect result even when you score and interpret the data according to the correct procedures?

Answer: Of course. There is no such thing as a perfect test. And we know from our research that it does happen.

Better question: How often does that happen?

To begin to answer this question ("begin to answer") we will need both field and laboratory studies. This is because field studies require confirmation evidence, which is non-random and therefore represents the population in a questionable way. Field samples will tend to be devoid of error variance because you cannot have confirmation evidence on an error - so errors will be excluded, and field samples are at risk for over-estimating accuracy.
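
Here is a rough simulation of that selection problem. Every rate in it is invented, just to show the mechanism by which confession-confirmed sampling can flatter the test:

# Simulation sketch (made-up rates): if field cases are confirmed mainly by
# post-test confession, false positives rarely get confirmed, so accuracy
# within the confirmed subsample looks better than the test really is.
import random

random.seed(1)
N = 100_000
true_sens, true_spec = 0.85, 0.85    # assumed "real" accuracy of the test
p_deceptive = 0.5                    # assumed base rate of deception
p_confess_if_sr_and_guilty = 0.6     # confessions mostly confirm true positives
p_other_confirmation = 0.05          # rare independent confirmation of any case

confirmed_total = confirmed_correct = 0
for _ in range(N):
    deceptive = random.random() < p_deceptive
    sr = random.random() < (true_sens if deceptive else 1 - true_spec)
    correct = (sr == deceptive)
    confessed = sr and deceptive and random.random() < p_confess_if_sr_and_guilty
    confirmed = confessed or random.random() < p_other_confirmation
    if confirmed:
        confirmed_total += 1
        confirmed_correct += correct

print(f"assumed true accuracy: {true_sens:.0%}")
print(f"accuracy within confirmed cases: {confirmed_correct / confirmed_total:.1%}")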

Laboratory studies are more easily constructed of randomly assigned guilty/innocent persons, and therefore don't suffer the unknowns about representativeness. Errors will be included in laboratory samples. However, laboratory studies have weaker ecological validity than field studies, and many have different psychological aspects than field exams.

These are complications. They do not negate the value and importance of either field or lab studies. Both are important. Interestingly, field studies do tend to result in slightly higher accuracy than lab studies. Which are correct? Answer: the difference is not significant and therefore meaningless. The answer probably exists in the convergence of lots of field and lab studies.

When we have lots of sample data, and lots of examiner scores, then you can, and should, assume that present error estimates (and accuracy estimates) do include this condition. Errors will include some conditions in which the examiner got the wrong result because of scoring and interpreting the data incorrectly, and will also include some conditions in which the error occurred because the case contained a higher than normal proportion of error variance while the examiner scored it correctly.

Of course it will be very difficult to ever know the exact proportion of times in which each error condition occurs. One way to study this is through comparison of the scores of multiple human scorers (which is why we like more than one scorer in all studies), and by comparing human scores/results with automated scores/results.

Ultimately the proportion does not matter. Both types of errors happen. Errors occur due to human procedural/scoring error. Errors occur due to uncontrolled variance of physiological response. But human procedural errors will be reduced through automation.

Research on facial expression as a means of deception detection is a good example. Researchers have to 1) find the features, and 2) learn how to extract or code (transform) the features into numerical scores. To do this they train scorers. Then they have to verify their scorers' abilities, including calculating the standard errors. It takes many hours to train the scorers (but graduate and undergraduate interns are cheap labor and have almost no rights). Then they do validation studies on the testing and decision model. Otherwise they are not testing the model; they are testing the scorers. Later they begin to automate the validated procedure to reduce both the human training demand and the potential for procedural errors.

The presence of error variance (uncontrolled variance) in all data is exactly why it bothers us so much when researchers do not report all data or do not provide all data for review. For example: what if they give us only the data for the cases they scored correctly (excluding inconclusive or error case data)? Of what value are the study results when sample mean scores for deceptive and truthful cases are reported without the scores of the error cases? It becomes a problem. We want to study the error variance so that we can determine if there are ways of increasing our ability to control and account for it. Then test accuracy will increase (though it will probably remain forever short of perfection).

The point is that we have to be interested in the exact questions that you have raised. What proportion of the data we want to score is actually error/uncontrolled variance?

Part of the value of having many scorers looking at the same case data (as opposed to studies with a single scorer) is that the inverse of the reliability statistics can start to illustrate the answer.

The recent meta-analysis showed that decision agreement is about 90.1% (standard error = .082), with about 10% disagreement. So, if all disagreement is attributed to human procedural error, a little bit of statistical alchemy says we can be 95% sure that the actual rate of human procedural error will be less than 23.4%. The average will most probably be lower than this.

Decision errors are just north of 10% (actually 13.1%) with a standard error of 3.6%. If we assume that all decision errors are attributable to error variance (examinee data is bad and the examiner scores it correctly), then the same bit of statistical alchemy, (1 - .869) + (1.6449 * .036), tells us that we can be 95% sure that examinee physiological responses will be wrong (inconsistent with our theory) less than 19% of the time. Again, the average will most probably be lower than this.
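
For anyone who wants to check my arithmetic, here are those two bounds written out (the figures are the ones quoted above; 1.6449 is the one-sided 95% multiplier):

# The two back-of-the-envelope bounds from the preceding paragraphs:
# one-sided 95% upper limit = (point estimate) + 1.6449 * (standard error).
z_one_sided_95 = 1.6449

# Bound 1: treat all scorer disagreement as human procedural error.
agreement, se_agreement = 0.901, 0.082
upper_procedural_error = (1 - agreement) + z_one_sided_95 * se_agreement
print(f"procedural error < {upper_procedural_error:.1%} with 95% confidence")

# Bound 2: treat all decision errors as uncontrolled error variance in the data.
correct_decisions, se_decisions = 0.869, 0.036
upper_data_error = (1 - correct_decisions) + z_one_sided_95 * se_decisions
print(f"error variance in the data < {upper_data_error:.1%} with 95% confidence")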

What you will notice is that we seldom, in science, have absolute answers. We have estimates, standard errors and confidence ranges.

What Gordon Barland alluded to was the concept of floors and ceilings - which allow us to begin to have confidence that the actual value will not be below or above a certain level.

We start by asking good questions. Then we start to answer those questions with the data we have at present. Later we hope to have more data to refine and improve our estimates.

So, it's a process like triangulating our present position in a rather wide area, and then narrowing it down as we gain more information.

The meta-analysis did not miss this point, but assumes that FP errors result from both reliability/procedural errors and bad data. What I have shown here is that even though we cannot give definitive answers we can partition the available data to determine the range of error likely to be caused by the factors you describe.

r


------------------
"Gentlemen, you can't fight in here. This is the war room."
--(Stanley Kubrick/Peter Sellers - Dr. Strangelove, 1964)


[This message has been edited by rnelson (edited 02-02-2012).]


rnelson
Member
posted 02-02-2012 11:24 AM
Dan:

quote:
Ray,
It’s like I said when we spoke on the phone a month or so ago…you practically have to firebomb this board to get any real action.

I clam up for a week and the cobwebs come back.


It's why we need you here, and your commentary. So, for the record, I consider your critical insights to be helpful. There is nothing wrong with pointing out the potential confounds of a study. It is a generally accepted standard for science and research: invite people to criticize the effort and point out what might be wrong with it, and what might be alternative explanations or alternative causes for the results.

quote:
Back to the recent meta-analysis: You make it sound like polygraph has been pronounced “scientific” by MIT, Underwriter’s Laboratories or even Consumers Union. But it’s an inside job from top to bottom.

Let’s take a look at the source data used in your much-ballyhooed meta-analysis..

What we have here is a small group of polygraph activists who submitted their own surveys at the behest of a self-serving trade group in a quest to scientifically legitimize polygraph.

* There was no independent scientific oversight or disinterested controlling authority.
* The inclusion of laboratory studies likely downplays the frequency of FP and FN results that occur in real-world tests.
* There was no vigorous or even measurable testing to gauge how polygraph fares against countermeasures.


We've tried to be really clear about the fact that validity is not a matter for declaration. That is why there is no "grandfathering" of some techniques as "valid."

re: activism and insider work

That concern would be exacerbated if the conclusions we offered were very different from the research reviews that were conducted by outsiders and critics - if our results were outliers to the distribution of results from other reviews.

I believe we were fairly rigorous in our efforts. If different results would exacerbate the concern, then results that are not dissimilar to those from external studies might mitigate this concern. Your pointing out the issue now allows everyone to consider all the facts and evidence for themselves and reach their own conclusions. Nothing wrong with that.

I don't think that any of us think we are going to simply declare the polygraph valid, or somehow earn a good-housekeeping-science-seal-of-approval from some over-arching authority. That is not how it happens.

Credibility will have to be earned, and that will happen slowly, one issue and one piece at a time. It will not happen if we take a thin and superficial look at the matter - if we rely on appeals to authority and neglect to examine the evidence itself. In that case the only people looking at and interpreting the evidence will be our critics.

The solution will be to ask people to actually respond to the data and evidence itself.

First, it will be our job to look carefully at the details and then decide if we think those details do or do not tell us something about the levels of accuracy that can realistically and reliably be anticipated from randomly selected field examiners using any randomly selected validated polygraph technique to test a randomly selected, normally functioning examinee from any random neighborhood.

Chance? Less than chance? Perfect? Somewhere better than chance and less than perfection? Mid to upper 80s? Now that we have looked at the evidence and described what we found, many more people can take a look at the evidence and reach their own conclusions about what the evidence seems to be telling us.

I will argue that the lack of "independent scientific oversight or disinterested controlling authority" is mitigated by the transparency of the process and analysis. Anyone who wants to check the analysis can find enough information in the report and appendices to re-calculate all of the results. We did as much as we could to include everything that could be verified.

All fields of scientific research are subject to the potential for "publication bias," or "file-drawer bias," in which the results of published studies tend to be helpful and study results that are not helpful tend not to get published. Findings that are significant get published. Insignificant findings sometimes don't get published. This is not unique to polygraph. The result is the potential for research to over-estimate effect sizes. Meta-analysis can help illustrate publication bias. Ours was a meta-analytic survey, but someone could, using the published data, calculate or graphically analyze the potential for publication bias. I think Barry did this, and it should not be a surprise that there is some evidence that publications may be overestimating accuracy.
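
For the curious, the usual graphical check is a funnel plot: each study's effect estimate plotted against its standard error, with asymmetry around the pooled estimate suggesting that small, unfavorable studies may be missing. A bare-bones sketch with placeholder numbers (not the meta-analysis values):

# Funnel-plot sketch for eyeballing publication bias. The accuracies and
# standard errors below are placeholders, not values from the meta-analysis.
import matplotlib.pyplot as plt

effects = [0.91, 0.88, 0.86, 0.93, 0.84, 0.90, 0.95, 0.87]      # hypothetical accuracies
std_errors = [0.02, 0.03, 0.04, 0.05, 0.03, 0.06, 0.07, 0.02]   # hypothetical SEs

# Inverse-variance weighted pooled estimate.
pooled = (sum(e / s**2 for e, s in zip(effects, std_errors))
          / sum(1 / s**2 for s in std_errors))

plt.scatter(effects, std_errors)
plt.axvline(pooled, linestyle="--")
plt.gca().invert_yaxis()               # convention: most precise studies at the top
plt.xlabel("reported accuracy")
plt.ylabel("standard error")
plt.title("Funnel plot (hypothetical data)")
plt.show()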

Your second asterisk - regarding FP and FN errors being not well represented in laboratory studies - seems to be at odds with the concerns of most others. Field samples that over-rely on confession confirmation, and even extrapolygraphic evidence, are more at risk for systematically excluding FP and FN errors (especially FP) because no evidence can be obtained when it does not exist.

Instead of engaging a personal opinion on this, we took an approach that is supported by evidence: both field and laboratory studies have advantages and disadvantages, and appear to produce results that do not differ significantly.

Your third asterisk item, regarding the lack of accuracy estimates in the context of countermeasures, is only partially correct.

We did not include samples of examinees subject to experimental manipulation (e.g., sleep deprivation, low-functioning persons, psychotic persons, or persons trained to use CMs). That does not mean that examinees did not attempt CMs. Some examinees probably did in some samples.

Before we can calculate and describe the effect that CMs have on polygraph accuracy we have to first have some reasonable understanding of polygraph accuracy. Otherwise we cannot partition the variance for that effect.

But remember, one definition of countermeasures would include anything that a deceptive examinee does to defeat the test. The problem is that all (or nearly all) deceptive examinees can be expected to be either hoping or doing something to try to pass the test.

To the extent that a number of the studies included in the meta-analysis did include field studies, it follows that we should assume that the deceptive examinees were doing the same: hoping or doing something to try to pass the test. To the extent that the samples are representative of the population, we should assume that the proportion of countermeasures reflects that of the population. We can expect the prevalence of CMs in the sample data to differ from the population to the extent that the samples do not represent the population.

Science is a chaotic process. Do enough of it, then sift and make order of the chaos, and we can begin to understand the consistent meaning of studies that do or do not seem to produce results that are similar or dissimilar. That's what meta-analysis is all about.

For what it's worth, the results of this meta-analysis are fairly consistent with several previous scientific reviews:

  • Abrams (1977) 91%
  • Raskin & Podlesney (1979) 90%
  • Office of Technology Assessment (1983) 84%
  • Abrams (1989) 88%
  • Honts & Pederson (1997) 90%
  • Crewson (2001) 88% - diagnostic exams
  • Honts & Raskin (2002) 90%
  • National Research Council (2003) 88%
  • Studies on the Reid Technique (1970s) 87%

The results of this meta-analysis are not consistent with:

  • Abrams (1973) 98%
  • Ansley (1983) 97%
  • Ansley (1990) 98%

Why are these results different? The most likely possibility is that these research reviews relied heavily on examiner self-reported accuracy of confession-confirmed cases. (Sampling confounds.) As our research and sampling methods have improved, our accuracy estimates have become more realistic.

The value of all this discussion, the details, and the publication is that people can decide for themselves - after they are aware of the details and quality of the evidence - whether the data seem to support or fail to support the scientific validity of the polygraph.

The alternative to an evidence-based conclusion is to invite emotionally driven conclusions or to invite conclusions driven by self-interest.

I believe the evidence-based approach is best for all the things we care about: national security, community safety, individual justice, the future and credibility of the APA and the polygraph profession, and the economic success of professional polygraph examiners.

Regardless, thanks again for the criticism and discussion. It helps us all, when we have the opportunity to consider the evidence from different perspectives, to make more informed decisions about what we think the results really mean.

One thing is certain: the meta-analysis is not the final word or final authority on polygraph accuracy. It's what we know today when we examine the evidence the way we did.

There will be a need for another review of the data at some point in the future.

For now, the meta-analysis does seem to answer some previously unaddressed questions about polygraph accuracy when we use field testing formats and test data analysis models that are in use today. It also suggests that we might do better to place less emphasis on named techniques and schools of thought, and more emphasis on constructing and interpreting good single-issue event-specific exams and multi-issue screening exams according to a more common set of principles that are known to contribute to the formulation of a valid polygraph test.

Emphasizing scientific principles, shelving hypotheses that don't seem to work, and learning to construct research samples that do a better job of representing the population will lead to better (more accurate) estimations of polygraph accuracy - and may or may not lead to more accurate polygraphs.

Anyway, I'll call you soon, when I am back home. I'll tell you about a new shiny object I'm thinking about.

Peace,

r

P.S. I think this might now be the longest thread ever.

P.P.S. You are right, it was Uno's, not Chili's. I am away too much - all I know is that it was nicer than my favorite restaurant, the Texaco.

------------------
"Gentlemen, you can't fight in here. This is the war room."
--(Stanley Kubrick/Peter Sellers - Dr. Strangelove, 1964)



Dan Mangan
Member
posted 02-03-2012 09:20 AM
Ray,

You make an excellent point that gets to the heart of my main beef...

>>>>First, it will be our job to look carefully at the details and then decide if we think those details do or do not tell us something about the levels of accuracy that can realistically and reliably be anticipated from randomly selected field examiners using any randomly selected validated polygraph technique to test a randomly selected normal functioning examinee from any random neighborhood.<<<<<

Lab studies just don't cut it. What is the examinee vying for? Movie tickets? Beer money? C'mon.

Things are different when that person in the polygraph chair is facing indictment, years of imprisonment, a messy and expensive divorce, or even public scorn for something they did or didn't do.

What's needed are real-life high-stakes cases presented in some kind of double-blind format. Frankly, I'm at a loss to outline how such experiments could be structured. Perhaps someone has an idea or two.

Beyond that, we have to find a way to drive a stake through Maschke's heart, figuratively speaking. Ignoring him is not the answer.

Meanwhile, it doesn’t matter what I think of polygraph (science vs. art vs. shamanism), as your Bonferroni turbocharged and Kruskal-Wallis afterburner-equipped alchemy juggernaut will roll on, bamboozling everyone in its path -- at least on the very friendly home field where gullible denizens abound. Infidels beware!

And let's not forget the money -- an entire industry depends on polygraph’s continued viability: schools, instrument manufacturers, insurance underwriters, private examiners and examiners employed at every level of gummint…they’re all heavily invested.

Polygraph is too big to fail.

Dan

P.S. Looking forward to hearing more about that shiny object. I'm guessing it has two wheels -- and the "gold" package.

IP: Logged

dkrapohl
Member
posted 02-03-2012 11:00 AM
Dan:

Let me return to an unanswered question from several posts ago. I am still confused about the mismatch between your published research findings of 100% and your opinion about polygraph accuracy....please explain how your research got the results it did.

Don


rnelson
Member
posted 02-03-2012 11:36 AM
quote:
Bonferroni turbocharged and Kruskal-Wallis afterburner-equipped alchemy juggernaut

I like it.

It's important not to take oneself too seriously.

Anyway, things like beer money are useful, because, well... they can be converted into beer.

What you are actually talking about is the potential ecological validity of lab studies. Do they adequately model or recreate the examination circumstances encountered in real life?

No one disputes the assumption that samples constructed of field cases have better ecological validity - they come from real-life exams. However, field samples have other problems - like non-random selection that can result in the systematic exclusion of both FP and FN errors and over-inflate accuracy estimations. Field studies also present nearly impossible difficulties in controlling enough of the potentially confounding variables, so they cannot establish causality. Results of field studies can establish correlation only.

If we remain limited to the hypothesis that polygraph is driven by fear, then it would seem that lab studies cannot work, because, as you point out, why would someone have the same degree of fear.

However, there seems to be a pile of evidence that fear is not what drives the polygraph - else Directed Lies would not work (they do work) and the polygraph would not work with psychopaths, who are known to have low levels of fear conditioning (they don't learn from their consequences). The polygraph does work with psychopaths.

So, fear seems a terribly inadequate hypothesis. When the evidence and hypothesis disagree it is time to upgrade one of them. (Hint: changing the evidence is not OK).

A more complete hypothesis would suggest that the polygraph is driven by a combination of emotion, behavioral conditioning, and cognitive activity.

Then look at the results of laboratory studies - look at the content of the meta-analysis - and don't remain limited to ad hominem arguments. Look at the NRC report. Look at Pollina et al. (2004). Look at Crewson (2001). The differences between field and lab studies are not statistically significant.

So, the argument of worthlessness re lab studies is absurd. In general field studies seem to produce slightly higher accuracy estimates.

It is therefore a non sequitur to suggest that polygraph is invalid or that lab studies are not worthwhile when these supposedly worthless studies produce results that do not differ significantly from the results of field studies and are still significantly greater than chance. The conclusion does not follow from the argument.

Instead, it appears that lab studies may provide a more conservative estimate of polygraph accuracy. Or field studies may be exhibiting the tendency to overestimate. We don't know. Without evidence, all we have is opinion, and it is not acceptable to impose opinion as an answer to an important scientific question (whether lab or field studies are better).

The evidence says that both are important. Both have their advantages. Both lab and field studies have their disadvantages. We therefore need both, and we therefore included both lab and field studies in the meta-analysis.

Challenge: find evidence, not mere opinion, that one is better or produces a significantly different result.

No disagreement that it would be great to construct real-life high-stakes studies of a double-blind format. Everyone is at a loss for how to design them. Other fields of research have the same problem - but do not allow themselves to become catatonic over it. The solution is to learn to use sub-optimal research designs. Use different types of research designs. Use a lot of studies. Then aggregate and compare them all and tease-out and start to understand the biases, advantages and disadvantages - and then we can begin to learn about the criterion accuracy of very complicated phenomena like polygraph accuracy (or other large and seemingly intractable research questions).

Look at the content of the meta-analysis (not just the authors and publishers) and you will notice that most all techniques are supported by a combination of laboratory and field studies. In nearly every case the resulting sampling scores of the laboratory and field studies did not differ significantly. That is, the scorers seemed to express the same attitude and approach to guilty and innocent participants, and applied the scoring model similarly to the different samples.

Of the 45 different surveys and experiments included in the meta-analysis, approximately half or more were field studies.

We decided at the outset to include both, based on the finding of no significant differences between the results of field and lab studies as reported by the NRC (2003).

More background evidence: Pollina, Dollins, Senter, Krapohl & Ryan (2004), in the Journal of Applied Psychology, did find that field cases tend to produce responses of larger magnitude than laboratory cases, for both relevant and comparison questions. However, the comparison of the relevant/comparison pair (when we score them) resulted in no significant differences. Which makes sense - because the transformation is what we call "dimensionless" (the units of measurement are cancelled out algebraically).
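
A quick illustration of the "dimensionless" point - the magnitudes below are hypothetical, and a log-ratio is just one example of a within-pair comparison in which the units cancel:

# Why a within-pair comparison is "dimensionless": a ratio of two responses
# measured in the same units is unchanged if both are rescaled by a constant
# (different sensor gains, different units, etc.).
import math

relevant, comparison = 2.4, 3.6                 # hypothetical response magnitudes
baseline = math.log(comparison / relevant)
for scale in (1.0, 10.0, 0.001):                # rescale both measurements together
    rescaled = math.log((comparison * scale) / (relevant * scale))
    print(f"scale {scale:g}: ln(C/R) = {rescaled:.4f} "
          f"(unchanged: {abs(rescaled - baseline) < 1e-12})")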

We could, if you are interested, separate the laboratory and field sample distributions of deceptive and truthful scores (separated by grand total for diagnostic exams and subtotals for screening exams), and compare the distributions of scores for laboratory and field samples. Let me know if you'd like to see this, and I will do it.

r

the shiny object is smaller and louder than what you are thinking - but could include a gold package.

------------------
"Gentlemen, you can't fight in here. This is the war room."
--(Stanley Kubrick/Peter Sellers - Dr. Strangelove, 1964)



Dan Mangan
Member
posted 02-03-2012 04:23 PM
>>>>the shiny object is smaller and louder than what you are thinking - but could include a gold package.<<<<

Ah, a pimp gun. Of course.


rnelson
Member
posted 02-04-2012 04:12 PM
haha

A gold plated 1911 would make for difficult concealment.

Not quite like that.

I'll tell you about it when I get a chance. Just found something interesting, but I haven't made a decision yet. I should probably talk to my wife too. Hmm (married life...).

Anyway, you are pointing out things that people should think about and ask about as they digest the results and meaning of a meta-analysis. The results are only as good as the studies. So how good are they and how much can they really tell us? One purpose of meta-analysis is that it allows us to learn more about those studies - how they compare to each other, to the distribution of other results, and to other efforts to tease-out answers to difficult (but not impossible) questions about the accuracy of polygraph screening and diagnostic tests.

One thing to keep in mind is that the meta-analysis is NOT an official list of validated techniques. It is simply what we come up with when we look at the published evidence in this way. Anyone is free to review the scientific literature for themselves and offer their own arguments and conclusions. Some might want to continue to separate field and lab studies. We chose not to do that because the evidence so far has been unconvincing that it makes a real difference.

I'll let you know about the shiny thing. The smarter part of me says I should probably spend the money on kitchen flooring instead of toys.

Peace,

r

------------------
"Gentlemen, you can't fight in here. This is the war room."
--(Stanley Kubrick/Peter Sellers - Dr. Strangelove, 1964)



Dan Mangan
Member
posted 02-04-2012 08:48 PM
Don,

In response to your question, my take on it is this:

A highly experienced and naturally gifted examiner employed a superior technique consistently in the prescribed manner -- to include the superbly artful execution of the technique's proprietary and uniquely subtle psychological nuances -- in order to realize the technique's maximum potential.

He succeeded.

Dan


Ted Todd
Member
posted 02-04-2012 10:49 PM
Dan,
WTF?????
Ted


Ted Todd
Member
posted 02-04-2012 10:49 PM
Dan,
WTF?????
Ted


Ted Todd
Member
posted 02-04-2012 10:52 PM
OK Ralph and Nadine,
Don't fix your web site and the duplicate posting problem. My last post was worth repeating and I thank you for doing it for me.

Ted

